-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix memory stats for cgroup v2 #2810
Conversation
In cgroup v2, MemoryStats.UseHierarchy has no meaning, since all resources are tracked as if UseHierarchy=true in v1. cgroup v2 also have some other names for the stats. For those interested, the mapping can be found here: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/memcontrol.c#n4026
Hi @odinuge. Thanks for your PR. I'm waiting for a google member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/ok-to-test |
/cc @harche |
Wow, looks great. Thanks @odinuge for this PR. I will test this with node e2e and report back. |
Have already tested. Together with the other changes to cadvisor, the results can be found here: https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/99269/pull-kubernetes-node-crio-cgrpv2-e2e/1363504000409800704/. Not sure what the Pagefault error is, but I don't think that is a problem with cgroup v2, but instead with the tests. The root memory is due to missing |
Ahh, the |
/cc @giuseppe |
thanks, what test is exactly failing because of crun? I can help and take a look. Do you know if it is caused by crun creating a subcgroup when systemd is used or is there any other reason? |
Yes, it is due to the For runc this "works" as expected, but since the device cgroup in runc is kinda broken for v2, runc doesn't work at all with kubernetes with cgroup v2... :/ |
could we read the cgroup from |
Hmm. This code (as a big simplification) loops through the cgroup hierarchy, and k8s selects the data it wants by using the cgroup of the container. I don't think k8s should deal with container_pids, that should be the job of the CRI runtime afaik. Not sure what the best solution is, other than using stats from CRI instead (have not checked if those work similar in cgroup v2 and crun as well)...
Yup. The systemd-implementation in k8s/cadvisor/libcontainer is fragile, and depends highly on the naming scheme... The runtime spec is kinda vague as well, but as I read it, it does give the guarantee either... When it comes to the systemd-docs, k8s and runc break most of those "don'ts" I guess 😅 edit: But, this discussion is probably outside the scope of this PR.. |
LGTM, thanks! |
In cgroup v2, MemoryStats.UseHierarchy has no meaning, since all
resources are tracked as if UseHierarchy=true in v1. cgroup v2 also have
some other names for the stats.
For those interested, the mapping can be found here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/memcontrol.c#n4026